{
 "cells": [
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "# Pandas code cards\n",
    "## Try me\n",
    "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ffraile/computer_science_tutorials/blob/main/source/Data%20Manipulation/exercises/Pandas%20code%20cards.ipynb)[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/ffraile/computer_science_tutorials/main?labpath=source%2FData%20Manipulation%2Fexercises%2FPandas%20code%20cards.ipynb)\n",
    "\n",
    "## How to use\n",
    "- Each card mirrors an A4 classroom prompt. **Predict first** (or discuss), then run the cell to check.\n",
    "- Detective cards show a buggy idea in Markdown; the code cell shows a **fixed** version.\n",
    "- Keep explanations short and schematic (*what* → *why*).\n",
    "\n",
    "### Turn Gemini into a coding tutor (no direct answers)\n",
    "Paste this in your first chat with Gemini to keep it in “tutor mode”:\n",
    "\n",
    "```markdown\n",
    "You are a **coding tutor** for Python in Jupyter/Colab. Follow the **course motto** “do not give up learning.”\n",
    "\n",
    "### Role & Goals\n",
    "- Use **Socratic guidance** and **test-first thinking** to help me solve problems myself.\n",
    "- Help me read errors, reason about state, and make small, safe iterations.\n",
    "\n",
    "### Strict Rules\n",
    "1) **Do not** provide full working solutions or paste complete functions/programs.\n",
    "   - You may show **tiny illustrative fragments (≤3 lines)** or **pseudo-code with TODOs**, but not a drop-in answer.\n",
    "2) Prefer **questions over answers**; offer **one small next step** at a time.\n",
    "3) When debugging, explain **what the traceback says**, give **2–3 hypotheses**, and propose the **smallest diff** in *plain English* first.\n",
    "4) Encourage **TDD**: ask me to write/assert a test, predict, run, and report outputs.\n",
    "5) Keep responses concise (≈120–150 words) unless I ask for a deeper explanation or code review.\n",
    "6) Ask me to **run code and share results**; adapt based on the output.\n",
    "7) If I request the full solution, remind me of the rules and offer a **higher-tier hint** instead.\n",
    "8) When I finalize an exercise, reinforce learning lessons and suggest additional exercises\n",
    "\n",
    "### Interaction Loop (use this structure)\n",
    "- **Restate goal:** what I’m trying to accomplish in one line.\n",
    "- **Diagnose:** key assumption to check or error to interpret.\n",
    "- **Hint (tiered):**\n",
    "  - Tier 1: Conceptual nudge (no code).\n",
    "  - Tier 2: Directed hint (identify line/construct to change).\n",
    "  - Tier 3: Pseudo-code with TODOs or a **1–3 line** pattern (still not a full solution).\n",
    "- **Next action:** one concrete step for me to try now.\n",
    "- **Ask back:** what to run/paste (output, test result, or traceback).\n",
    "\n",
    "### When reviewing my code\n",
    "- Comment on **correctness, clarity, naming, and complexity (big-O)**.\n",
    "- Suggest **tests** I’m missing (boundaries, empty cases, error paths).\n",
    "\n",
    "### Safety & Ethics\n",
    "- No secrets or private data in prompts.\n",
    "- avoid library functions/APIs unless I ask.\n",
    "\n",
    "Stay in tutor mode for the whole session.\n",
    "```\n",
    "\n",
    "## Code Cards\n",
    "1. Predict the output of this code:\n",
    "```python\n",
    "import pandas as pd\n",
    "data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}\n",
    "df = pd.DataFrame(data)\n",
    "print(df.iloc[0, 1])\n",
    "```\n"
   ],
   "id": "a669349e86dafe05"
  },
  {
   "metadata": {},
   "cell_type": "code",
   "outputs": [],
   "execution_count": null,
   "source": "",
   "id": "2034110f982772d8"
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "2. Predict the output of this code:\n",
    "```python\n",
    "import pandas as pd\n",
    "data = {'X': [10, 20, 30], 'Y': [40, 50, 60], 'Z': [70, 80, 90]}\n",
    "df = pd.DataFrame(data)\n",
    "print(df[['X', 'Z']])\n",
    "```\n"
   ],
   "id": "7656bd3be66abbb5"
  },
  {
   "metadata": {},
   "cell_type": "code",
   "outputs": [],
   "execution_count": null,
   "source": "",
   "id": "96d12c633a013576"
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "The next exercises use the dataframe ```df``` below, loading Diabetes data from a CSV file, and has the following columns:\n",
    "\n",
    "* Pregnancies: Number of times pregnant\n",
    "* Glucose: Plasma glucose concentration a 2 hours in an oral glucose tolerance test\n",
    "* BloodPressure: Diastolic blood pressure (mm Hg)\n",
    "* SkinThickness: Triceps skin fold thickness (mm)\n",
    "* Insulin: 2-Hour serum insulin (mu U/ml)\n",
    "* BMI: Body mass index (weight in kg/(height in m)^2)\n",
    "* DiabetesPedigreeFunction: Diabetes pedigree function\n",
    "* Age: Age (years)\n",
    "* Outcome: Class variable (0 or 1)\n",
    "* 268 of 768 are 1, the others are 0\n",
    "* Class Distribution: (class value 1 is interpreted as \"tested positive for diabetes\")\n",
    "\n",
    "The following code loads the dataset into a Pandas dataframe and shows the first 5 rows:\n",
    "\n"
   ],
   "id": "e2f1fc736b948087"
  },
  {
   "metadata": {
    "ExecuteTime": {
     "end_time": "2025-12-25T18:44:56.558283Z",
     "start_time": "2025-12-25T18:44:55.813155Z"
    }
   },
   "cell_type": "code",
   "source": [
    "import pandas as pd\n",
    "df = pd.read_csv('https://raw.githubusercontent.com/ffraile/computer_science_tutorials/main/source/Data%20Manipulation/exercises/datasets/diabetes.csv')\n",
    "df.head()"
   ],
   "id": "7e43825c9dabc0c1",
   "outputs": [
    {
     "data": {
      "text/plain": [
       "   Pregnancies  Glucose  BloodPressure  SkinThickness  Insulin   BMI  \\\n",
       "0            6      148             72             35        0  33.6   \n",
       "1            1       85             66             29        0  26.6   \n",
       "2            8      183             64              0        0  23.3   \n",
       "3            1       89             66             23       94  28.1   \n",
       "4            0      137             40             35      168  43.1   \n",
       "\n",
       "   DiabetesPedigreeFunction  Age  Outcome  \n",
       "0                     0.627   50        1  \n",
       "1                     0.351   31        0  \n",
       "2                     0.672   32        1  \n",
       "3                     0.167   21        0  \n",
       "4                     2.288   33        1  "
      ],
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Pregnancies</th>\n",
       "      <th>Glucose</th>\n",
       "      <th>BloodPressure</th>\n",
       "      <th>SkinThickness</th>\n",
       "      <th>Insulin</th>\n",
       "      <th>BMI</th>\n",
       "      <th>DiabetesPedigreeFunction</th>\n",
       "      <th>Age</th>\n",
       "      <th>Outcome</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>6</td>\n",
       "      <td>148</td>\n",
       "      <td>72</td>\n",
       "      <td>35</td>\n",
       "      <td>0</td>\n",
       "      <td>33.6</td>\n",
       "      <td>0.627</td>\n",
       "      <td>50</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1</td>\n",
       "      <td>85</td>\n",
       "      <td>66</td>\n",
       "      <td>29</td>\n",
       "      <td>0</td>\n",
       "      <td>26.6</td>\n",
       "      <td>0.351</td>\n",
       "      <td>31</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>8</td>\n",
       "      <td>183</td>\n",
       "      <td>64</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>23.3</td>\n",
       "      <td>0.672</td>\n",
       "      <td>32</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>1</td>\n",
       "      <td>89</td>\n",
       "      <td>66</td>\n",
       "      <td>23</td>\n",
       "      <td>94</td>\n",
       "      <td>28.1</td>\n",
       "      <td>0.167</td>\n",
       "      <td>21</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0</td>\n",
       "      <td>137</td>\n",
       "      <td>40</td>\n",
       "      <td>35</td>\n",
       "      <td>168</td>\n",
       "      <td>43.1</td>\n",
       "      <td>2.288</td>\n",
       "      <td>33</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "execution_count": 1
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "3. Complete the code below to calculate the average BMI of patients with diabetes (Outcome == 1) and without diabetes (Outcome == 0).\n",
    "```python\n",
    "df_diabetes = df[?] # Filter rows where Outcome == 1\n",
    "avg_bmi_diabetes = df_diabetes['BMI'].mean()\n",
    "df_no_diabetes = df[?] # Filter rows where Outcome == 0\n",
    "avg_bmi_no_diabetes = df_no_diabetes['BMI'].mean()\n",
    "print(\"Average BMI of patients with diabetes:\", avg_bmi_diabetes)\n",
    "print(\"Average BMI of patients without diabetes:\", avg_bmi_no_diabetes)\n",
    "```\n",
    "\n"
   ],
   "id": "2d0b736b2279c84"
  },
  {
   "metadata": {},
   "cell_type": "code",
   "outputs": [],
   "execution_count": null,
   "source": "",
   "id": "edcf967431752fb"
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "4. Complete the code below to find cases where the number of pregnancies is greater than 2 and the glucose level is above 150.\n",
    "```python\n",
    "filtered_df = df.query(?) # Filter rows where Pregnancies > 2 and Glucose > 150\n",
    "print(filtered_df.describe())\n",
    "```"
   ],
   "id": "5a784bb190830671"
  },
  {
   "metadata": {},
   "cell_type": "code",
   "outputs": [],
   "execution_count": null,
   "source": "",
   "id": "bafc8fec86cafdd8"
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "5. Complete the code below to find cases where patients are older than 50 or have a BMI greater than 30.\n",
    "\n",
    "```python\n",
    "filtered_df = df.query(?) # Filter rows where Age > 50 or BMI > 30\n",
    "print(filtered_df.describe())\n",
    "```"
   ],
   "id": "b98b80bac80ea541"
  },
  {
   "metadata": {},
   "cell_type": "code",
   "outputs": [],
   "execution_count": null,
   "source": "",
   "id": "44478569d37423a"
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "The next exercises use the COVID dataset, which contains information about the COVID-19 pandemic. The dataset contains the following columns:\n",
    "- Date: Date of the record\n",
    "- Country: Country name\n",
    "- Confirmed: Cumulative number of confirmed cases\n",
    "- Recovered: Cumulative number of recovered cases\n",
    "- Deaths: Cumulative number of deaths\n",
    "\n",
    "\n",
    "The following code loads the dataset into a Pandas dataframe:"
   ],
   "id": "eea68429a4045957"
  },
  {
   "metadata": {
    "ExecuteTime": {
     "end_time": "2025-12-25T18:58:59.640588Z",
     "start_time": "2025-12-25T18:58:41.261693Z"
    }
   },
   "cell_type": "code",
   "source": [
    "covid_pd = pd.read_csv('https://raw.githubusercontent.com/ffraile/computer_science_tutorials/main/source/Data%20Manipulation/exercises/datasets/covid.csv')\n",
    "\n",
    "covid_pd"
   ],
   "id": "f59dadff8bc906b3",
   "outputs": [
    {
     "data": {
      "text/plain": [
       "              Date      Country  Confirmed  Recovered  Deaths\n",
       "0       2020-01-22  Afghanistan          0          0       0\n",
       "1       2020-01-23  Afghanistan          0          0       0\n",
       "2       2020-01-24  Afghanistan          0          0       0\n",
       "3       2020-01-25  Afghanistan          0          0       0\n",
       "4       2020-01-26  Afghanistan          0          0       0\n",
       "...            ...          ...        ...        ...     ...\n",
       "143663  2022-01-19     Zimbabwe     226887          0    5266\n",
       "143664  2022-01-20     Zimbabwe     227552          0    5276\n",
       "143665  2022-01-21     Zimbabwe     227961          0    5288\n",
       "143666  2022-01-22     Zimbabwe     228179          0    5292\n",
       "143667  2022-01-23     Zimbabwe     228254          0    5294\n",
       "\n",
       "[143668 rows x 5 columns]"
      ],
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Date</th>\n",
       "      <th>Country</th>\n",
       "      <th>Confirmed</th>\n",
       "      <th>Recovered</th>\n",
       "      <th>Deaths</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>2020-01-22</td>\n",
       "      <td>Afghanistan</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2020-01-23</td>\n",
       "      <td>Afghanistan</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2020-01-24</td>\n",
       "      <td>Afghanistan</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2020-01-25</td>\n",
       "      <td>Afghanistan</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>2020-01-26</td>\n",
       "      <td>Afghanistan</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>143663</th>\n",
       "      <td>2022-01-19</td>\n",
       "      <td>Zimbabwe</td>\n",
       "      <td>226887</td>\n",
       "      <td>0</td>\n",
       "      <td>5266</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>143664</th>\n",
       "      <td>2022-01-20</td>\n",
       "      <td>Zimbabwe</td>\n",
       "      <td>227552</td>\n",
       "      <td>0</td>\n",
       "      <td>5276</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>143665</th>\n",
       "      <td>2022-01-21</td>\n",
       "      <td>Zimbabwe</td>\n",
       "      <td>227961</td>\n",
       "      <td>0</td>\n",
       "      <td>5288</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>143666</th>\n",
       "      <td>2022-01-22</td>\n",
       "      <td>Zimbabwe</td>\n",
       "      <td>228179</td>\n",
       "      <td>0</td>\n",
       "      <td>5292</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>143667</th>\n",
       "      <td>2022-01-23</td>\n",
       "      <td>Zimbabwe</td>\n",
       "      <td>228254</td>\n",
       "      <td>0</td>\n",
       "      <td>5294</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>143668 rows × 5 columns</p>\n",
       "</div>"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "execution_count": 2
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "6. Complete the code below to obtain the time series of Deaths in Spain.\n",
    "```python\n",
    "spain_deaths = covid_pd[?] # Filter rows where Country == 'Spain'\n",
    "print(spain_deaths[['Date', 'Deaths']])\n",
    "```"
   ],
   "id": "26f07e060270c7a8"
  },
  {
   "metadata": {},
   "cell_type": "code",
   "outputs": [],
   "execution_count": null,
   "source": "",
   "id": "ec866da50b1d5c1a"
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "7. Complete the code below to extract the Confirmed cases in China in 2020.\n",
    "```python\n",
    "china_2020 = covid_pd.query() # Filter rows where Country == 'China' and Date is in 2020\n",
    "print(china_2020[['Date', 'Confirmed']])\n",
    "```"
   ],
   "id": "e533677ea0d44837"
  },
  {
   "metadata": {},
   "cell_type": "code",
   "outputs": [],
   "execution_count": null,
   "source": "",
   "id": "916e04582aa38e4"
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "8. Knowing that the function ```diff()``` computes the difference between consecutive rows,  and that ```idmax()``` returns the index of the row with max value, complete the code below to find the date with the highest dead toll in the US.\n",
    "```python\n",
    "us_deaths = covid_pd[?].copy() # Filter rows where Country == 'US'\n",
    "us_deaths['Daily_Deaths'] = us_deaths['Deaths'].diff()\n",
    "max_death_date = us_deaths.loc[us_deaths['Daily_Deaths'].idxmax()]\n",
    "print(\"Date with highest daily deaths in US:\", max_death_date['Date'], \"with\", max_death_date['Daily_Deaths'], \"deaths\")\n",
    "```"
   ],
   "id": "677baaba26196a89"
  },
  {
   "metadata": {
    "ExecuteTime": {
     "end_time": "2025-12-25T19:08:18.118049Z",
     "start_time": "2025-12-25T19:08:18.109202Z"
    }
   },
   "cell_type": "code",
   "source": "",
   "id": "eeef391fa6f4189a",
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Date with highest daily deaths in US: 2021-01-20 with 4442.0 deaths\n"
     ]
    }
   ],
   "execution_count": 4
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "The next exercises use a dataframe that loads data from the following dictionary:\n",
    "```python\n",
    "data = [\n",
    "    {\"id\": 101, \"name\": \"Scott Summers\", \"alias\": \"Cyclops\", \"program\":\"CS\", \"score\": 9.5},\n",
    "    {\"id\": 102, \"name\": \"Jean Grey\", \"alias\": \"Phoenix\", \"program\":\"CS\", \"score\": 8.7},\n",
    "    {\"id\": 103, \"name\": \"Logan Howlett\", \"alias\": \"Wolverine\", \"program\":\"CS\", \"score\": 7.8},\n",
    "    {\"id\": 104, \"name\": \"Ororo Munroe\", \"alias\": \"Storm\", \"program\":\"CS\", \"score\": 8.9},\n",
    "    {\"id\": 105, \"name\": \"Charles Xavier\", \"alias\": \"Professor X\", \"program\":\"CS\", \"score\": 10},\n",
    "    {\"id\": 101, \"name\": \"Scott Summers\", \"alias\": \"Cyclops\", \"program\":\"Physics\", \"score\": 10.0},\n",
    "    {\"id\": 102, \"name\": \"Jean Grey\", \"alias\": \"Phoenix\", \"program\":\"Physics\", \"score\": 9.5},\n",
    "    {\"id\": 103, \"name\": \"Logan Howlett\", \"alias\": \"Wolverine\", \"program\":\"Physics\", \"score\": 8.0},\n",
    "    {\"id\": 104, \"name\": \"Ororo Munroe\", \"alias\": \"Storm\", \"program\":\"Physics\", \"score\": 9.1},\n",
    "    {\"id\": 105, \"name\": \"Charles Xavier\", \"alias\": \"Professor X\", \"program\":\"Physics\", \"score\": 10.0}\n",
    "]\n",
    "```\n",
    "The following code loads the dataset into a Pandas dataframe:"
   ],
   "id": "2888da3b8063ad6a"
  },
  {
   "metadata": {
    "ExecuteTime": {
     "end_time": "2025-12-25T19:15:44.511453Z",
     "start_time": "2025-12-25T19:15:44.506267Z"
    }
   },
   "cell_type": "code",
   "source": [
    "data = [\n",
    "    {\"id\": 101, \"name\": \"Scott Summers\", \"alias\": \"Cyclops\", \"program\":\"CS\", \"score\": 9.5},\n",
    "    {\"id\": 102, \"name\": \"Jean Grey\", \"alias\": \"Phoenix\", \"program\":\"CS\", \"score\": 8.7},\n",
    "    {\"id\": 103, \"name\": \"Logan Howlett\", \"alias\": \"Wolverine\", \"program\":\"CS\", \"score\": 7.8},\n",
    "    {\"id\": 104, \"name\": \"Ororo Munroe\", \"alias\": \"Storm\", \"program\":\"CS\", \"score\": 8.9},\n",
    "    {\"id\": 105, \"name\": \"Charles Xavier\", \"alias\": \"Professor X\", \"program\":\"CS\", \"score\": 10},\n",
    "    {\"id\": 101, \"name\": \"Scott Summers\", \"alias\": \"Cyclops\", \"program\":\"Physics\", \"score\": 10.0},\n",
    "    {\"id\": 102, \"name\": \"Jean Grey\", \"alias\": \"Phoenix\", \"program\":\"Physics\", \"score\": 9.5},\n",
    "    {\"id\": 103, \"name\": \"Logan Howlett\", \"alias\": \"Wolverine\", \"program\":\"Physics\", \"score\": 8.0},\n",
    "    {\"id\": 104, \"name\": \"Ororo Munroe\", \"alias\": \"Storm\", \"program\":\"Physics\", \"score\": 9.1},\n",
    "    {\"id\": 105, \"name\": \"Charles Xavier\", \"alias\": \"Professor X\", \"program\":\"Physics\", \"score\": 10.0}\n",
    "]\n",
    "df1 = pd.DataFrame(data)\n",
    "df1"
   ],
   "id": "421be74b6106e117",
   "outputs": [
    {
     "data": {
      "text/plain": [
       "    id            name        alias  program  score\n",
       "0  101   Scott Summers      Cyclops       CS    9.5\n",
       "1  102       Jean Grey      Phoenix       CS    8.7\n",
       "2  103   Logan Howlett    Wolverine       CS    7.8\n",
       "3  104    Ororo Munroe        Storm       CS    8.9\n",
       "4  105  Charles Xavier  Professor X       CS   10.0\n",
       "5  101   Scott Summers      Cyclops  Physics   10.0\n",
       "6  102       Jean Grey      Phoenix  Physics    9.5\n",
       "7  103   Logan Howlett    Wolverine  Physics    8.0\n",
       "8  104    Ororo Munroe        Storm  Physics    9.1\n",
       "9  105  Charles Xavier  Professor X  Physics   10.0"
      ],
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>id</th>\n",
       "      <th>name</th>\n",
       "      <th>alias</th>\n",
       "      <th>program</th>\n",
       "      <th>score</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>101</td>\n",
       "      <td>Scott Summers</td>\n",
       "      <td>Cyclops</td>\n",
       "      <td>CS</td>\n",
       "      <td>9.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>102</td>\n",
       "      <td>Jean Grey</td>\n",
       "      <td>Phoenix</td>\n",
       "      <td>CS</td>\n",
       "      <td>8.7</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>103</td>\n",
       "      <td>Logan Howlett</td>\n",
       "      <td>Wolverine</td>\n",
       "      <td>CS</td>\n",
       "      <td>7.8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>104</td>\n",
       "      <td>Ororo Munroe</td>\n",
       "      <td>Storm</td>\n",
       "      <td>CS</td>\n",
       "      <td>8.9</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>105</td>\n",
       "      <td>Charles Xavier</td>\n",
       "      <td>Professor X</td>\n",
       "      <td>CS</td>\n",
       "      <td>10.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>101</td>\n",
       "      <td>Scott Summers</td>\n",
       "      <td>Cyclops</td>\n",
       "      <td>Physics</td>\n",
       "      <td>10.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>102</td>\n",
       "      <td>Jean Grey</td>\n",
       "      <td>Phoenix</td>\n",
       "      <td>Physics</td>\n",
       "      <td>9.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>103</td>\n",
       "      <td>Logan Howlett</td>\n",
       "      <td>Wolverine</td>\n",
       "      <td>Physics</td>\n",
       "      <td>8.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>104</td>\n",
       "      <td>Ororo Munroe</td>\n",
       "      <td>Storm</td>\n",
       "      <td>Physics</td>\n",
       "      <td>9.1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>105</td>\n",
       "      <td>Charles Xavier</td>\n",
       "      <td>Professor X</td>\n",
       "      <td>Physics</td>\n",
       "      <td>10.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "execution_count": 5
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "9. Predict the output of this code:\n",
    "\n",
    "```python\n",
    "df1_cs = df1[alias == 'Cyclops']\n",
    "print(df1_cs)\n",
    "```\n",
    "\n"
   ],
   "id": "b5b26907d045642a"
  },
  {
   "metadata": {},
   "cell_type": "code",
   "outputs": [],
   "execution_count": null,
   "source": "",
   "id": "d64de17f97127143"
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "10. Predict the output of this code:\n",
    "```python\n",
    "df1_high = df1[df1['score'] > 9.0]\n",
    "print(df1_high)\n",
    "```"
   ],
   "id": "638e51fadd882df8"
  },
  {
   "metadata": {},
   "cell_type": "code",
   "outputs": [],
   "execution_count": null,
   "source": "",
   "id": "ddf3666dc6524f04"
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "11. Complete the code below to find the average score of students in the CS program.\n",
    "```python\n",
    "df1_cs = df1[?] # Filter rows where program == 'CS'\n",
    "avg_score_cs = df1_cs['score'].mean()\n",
    "print(\"Average score in CS program:\", avg_score_cs)\n",
    "```"
   ],
   "id": "f2d9a3237dc295c"
  },
  {
   "metadata": {},
   "cell_type": "code",
   "outputs": [],
   "execution_count": null,
   "source": "",
   "id": "e9acc2cf6d594c45"
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "12. Complete the code below to find students with a score greater than 9.0 in the Physics program.\n",
    "```python\n",
    "df1_physics_high = df1.query(?)# Filter rows where program is Physics and score is greater than  9.0\n",
    "print(df1_physics_high)\n",
    "```"
   ],
   "id": "89d9920bf365cb52"
  },
  {
   "metadata": {},
   "cell_type": "code",
   "outputs": [],
   "execution_count": null,
   "source": "",
   "id": "584a0996e7be3c7e"
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}